Simbed: Similarity-Based Embedding

نویسندگان

  • John Aldo Lee
  • Michel Verleysen
چکیده

Simbed, standing for similarity-based embedding, is a new method of embedding high-dimensional data. It relies on the preservation of pairwise similarities rather than distances. In this respect, Simbed can be related to other techniques such as stochastic neighbor embedding and its variants. A connection with curvilinear component analysis is also pointed out. Simbed differs from these methods by the way similarities are defined and compared in both the data and embedding spaces. In particular, similarities in Simbed can account for the phenomenon of norm concentration that occurs in high-dimensional spaces. This feature is shown to reinforce the advantage of Simbed over other embedding techniques in experiments with a face database.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Link Prediction using Network Embedding based on Global Similarity

Background: The link prediction issue is one of the most widely used problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. The link prediction local approaches with node structure objectives are fast in case of speed but are not accurate enough. On the other hand, the global link predicti...

متن کامل

Difficulty of Processing Japanese and Korean Center- embedding Constructions

This research investigates the effects of syntactic, semantic, and morphophonemic similarity in the processing of center-embedding constructions in Japanese and Korean. Six Japanese experiments study the effects of syntactic and semantic similarity, while one Korean experiment deals with the effects of syntactic and morphophonemic similarity. The results from these experiments support the view ...

متن کامل

Uncertainty in Neural Network Word Embedding: Exploration of Threshold for Similarity

Word embedding, specially with its recent developments, promises a quantification of the similarity between terms. However, it is not clear to which extent this similarity value can be genuinely meaningful and useful for subsequent tasks. We explore how the similarity score obtained from the models is really indicative of term relatedness. We first observe and quantify the uncertainty factor of...

متن کامل

Model Based Method for Determining the Minimum Embedding Dimension from Solar Activity Chaotic Time Series

Predicting future behavior of chaotic time series system is a challenging area in the literature of nonlinear systems. The prediction's accuracy of chaotic time series is extremely dependent on the model and the learning algorithm. On the other hand the cyclic solar activity as one of the natural chaotic systems has significant effects on earth, climate, satellites and space missions. Several m...

متن کامل

Heterogeneous Information Network Embedding for Meta Path based Proximity

A network embedding is a representation of a large graph in a lowdimensional space, where vertices are modeled as vectors. The objective of a good embedding is to preserve the proximity (i.e., similarity) between vertices in the original graph. This way, typical search and mining methods (e.g., similarity search, kNN retrieval, classification, clustering) can be applied in the embedded space wi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009